of different DNA and RNA sequences. Probes of this type are widely used to detect
the nucleic acids corresponding to specific genes, both to facilitate the
purification and characterization of the genes after cell lysis and to localize
them in cells, tissues, and organisms. Moreover, by carrying out hybridization
reactions under conditions of "reduced stringency," a probe prepared from one
gene can be used to find its evolutionary relatives, both in the same organism,
where the relatives form part of a gene family, and in other organisms, where the
evolutionary history of the nucleotide sequence can be traced.



Figure 7-20 In situ hybridization for
RNA localization in tissues.
Autoradiograph of a section of a very
young Drosophila embryo that has
been subjected to in situ
hybridization using a radioactive DNA
probe complementary to a gene
involved in segment development.
The probe has hybridized to RNA in
the embryo, and the pattern of
autoradiographic silver grains reveals
that the RNA made by the gene
(called ftz) is localized in alternating
stripes across the embryo that are
three or four cells wide. At this stage
of development (cellular blastoderm),
the embryo contains about 6000 cells.
(From E. Hafen, A. Kuroiwa, and W.J.
Gehring, Cell 37:833-841, 1984. © Cell
Press.)



DNA Cloning 15 

In DNA cloning, a DNA fragment that contains a gene of interest is inserted into
the purified DNA genome of a self-replicating genetic element, generally a virus
or a plasmid. A DNA fragment containing a human gene, for example, can be
joined in a test tube to the chromosome of a bacterial virus, and the new
recombinant DNA molecule can then be introduced into a bacterial cell. Starting
with only one such recombinant DNA molecule that infects a single cell, the normal
replication mechanisms of the virus can produce an enormous number of identical
DNA molecules in less than a day, thereby amplifying the amount of the inserted
human DNA fragment by the same factor. A virus or plasmid used in this way is
known as a cloning vector, and the DNA propagated by insertion into it is said to
have been cloned.

A DNA Library Can Be Made Using Either 
Viral or Plasmid Vectors 16 

In order to clone a specific gene, one begins by constructing a DNA library, a
comprehensive collection of cloned DNA fragments, including (one hopes) at
least one fragment that contains the gene of interest. The library can be
constructed using either a virus or a plasmid vector and is generally housed in a
population of bacterial cells. The principles underlying the methods used for
cloning genes are the same for either type of cloning vector, although the details
may be different. For simplicity, in this chapter we ignore these differences and
illustrate the methods with reference to plasmid vectors.

The plasmid vectors used for gene cloning are small circular molecules of
double-stranded DNA derived from larger plasmids that occur naturally in
bacterial cells. They generally account for only a minor fraction of the total
host bacterial cell DNA, but they can easily be separated on the basis of their
small size from chromosomal DNA molecules, which are large and precipitate as a
pellet upon centrifugation. For use as cloning vectors, the purified plasmid DNA
circles are first cut with a restriction nuclease to create linear DNA molecules.
The cellular DNA to be used in constructing the library is cut with the same
restriction nuclease, and the resulting restriction fragments (including those
containing the gene to be cloned) are then added to the cut plasmids and annealed
via their cohesive ends to form recombinant DNA circles. These recombinant
molecules containing foreign DNA inserts are then covalently sealed with the
enzyme DNA ligase (Figure 7-21).

[Figure 7-21 diagram labels: circular plasmid DNA molecule; linear plasmid DNA
molecule with cohesive ends; one of many DNA fragments produced by cutting
chromosomal DNA with the same restriction nuclease]
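The cut-and-join logic described above can be sketched in a few lines of code. This is only a toy model under stated assumptions: it tracks the top strand alone, it assumes the enzyme is EcoRI (recognition site GAATTC, which leaves the AATT overhang suggested by the figure), and every sequence and function name is invented for illustration.

```python
# Toy model of the top strand only; all sequences are invented for illustration.
SITE = "GAATTC"                # EcoRI recognition sequence (assumed enzyme)
CUT_OFFSET = 1                 # EcoRI cuts between G and AATTC
STICKY_START = SITE[CUT_OFFSET:]   # "AATTC": how every cut fragment begins
STICKY_END = SITE[:CUT_OFFSET]     # "G": how every cut fragment ends


def cut_positions(seq):
    """Return the indices at which the top strand is cut."""
    positions, start = [], seq.find(SITE)
    while start != -1:
        positions.append(start + CUT_OFFSET)
        start = seq.find(SITE, start + 1)
    return positions


def digest_linear(seq):
    """Cut a linear molecule at every site and return the fragments."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def linearize_plasmid(plasmid):
    """Open a circular plasmid (written as a string) at its single site.

    Simplification: a site spanning the arbitrary start of the string is ignored.
    """
    cuts = cut_positions(plasmid)
    if len(cuts) != 1:
        raise ValueError("expected exactly one restriction site")
    return plasmid[cuts[0]:] + plasmid[:cuts[0]]


def has_cohesive_ends(fragment):
    """True if both ends of the fragment were produced by the enzyme."""
    return fragment.startswith(STICKY_START) and fragment.endswith(STICKY_END)


def ligate(vector, insert):
    """Join a cut vector and an insert into a recombinant circle (as a string)."""
    if not (has_cohesive_ends(vector) and has_cohesive_ends(insert)):
        raise ValueError("ends are not compatible")
    return vector + insert


def count_sites_circular(circle):
    """Count recognition sites in a circular molecule, including the wrap-around."""
    return (circle + circle[:len(SITE) - 1]).count(SITE)


plasmid = "TTGACAGAATTCCGTAGG"                      # circular, one site
chromosome = "CCCGAATTCATGAAACCCGGGTTTGAATTCTTT"    # linear, two sites

vector = linearize_plasmid(plasmid)
inserts = [f for f in digest_linear(chromosome) if has_cohesive_ends(f)]
recombinant = ligate(vector, inserts[0])
print(recombinant, count_sites_circular(recombinant))
```

Because both junctions regenerate the recognition site, the recombinant circle in this sketch carries two sites flanking the insert, which is why the same enzyme can later be used to cut the insert back out.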

In the next step in preparing the library, the recombinant DNA circles are
introduced into bacterial cells that have been made transiently permeable to
DNA; such cells are said to be transfected with the plasmids. As these cells grow
and divide, doubling in number every 30 minutes, the recombinant plasmids also
replicate to produce an enormous number of copies of DNA circles containing
the foreign DNA (Figure 7-22). Many bacterial plasmids carry genes for
antibiotic resistance, a property that can be exploited to select those cells that
have been successfully transfected; if the bacteria are grown in the presence of
the antibiotic, only cells containing plasmids will survive. Each original
bacterial cell that was initially transfected will, in general, contain a different
foreign DNA insert; this insert will be inherited by all of the progeny cells of
that bacterium, which together form a small colony in a culture dish.
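As a rough illustration of the selection and growth steps just described, the sketch below filters a made-up population for plasmid-bearing cells and then estimates how many progeny one survivor yields. The 30-minute doubling time comes from the text; the cell records, the function names, and the assumption of unlimited exponential growth are illustrative only.

```python
# Hypothetical toy model: cell records and names are illustrative, not from the text.
DOUBLING_TIME_MIN = 30  # doubling time quoted in the text


def select_transfected(cells):
    """Antibiotic selection: only cells carrying a plasmid survive."""
    return [cell for cell in cells if cell["plasmid"] is not None]


def colony_size(minutes, doubling_time=DOUBLING_TIME_MIN):
    """Cells descended from one survivor, assuming unlimited exponential growth."""
    return 2 ** (minutes // doubling_time)


cells = [
    {"id": 1, "plasmid": "insert-A"},
    {"id": 2, "plasmid": None},        # took up no plasmid, so it is killed
    {"id": 3, "plasmid": "insert-B"},
]

survivors = select_transfected(cells)
for cell in survivors:
    # Every progeny cell in a colony inherits the same insert as its founder.
    print(cell["id"], cell["plasmid"], colony_size(10 * 60))
```

Twenty doublings (10 hours in this idealized model) already give about a million cells per colony; real cultures slow down as nutrients run out.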

The mixture of many different surviving bacteria contains the DNA library,
composed of a large number of different DNA inserts. The problem is that only
a few of the bacteria will harbor the particular recombinant plasmids that
contain the desired gene. One needs to be able to identify these rare cells in
order to recover the DNA of interest in pure form and in useful quantities. Before
discussing how this is achieved, we need to describe a second strategy for
generating a DNA library that is commonly used in gene cloning.

Two Types of DNA Libraries Serve Different Purposes 17 

Cleaving the entire genome of a cell with a specific restriction nuclease as just
described is sometimes called the "shotgun" approach to gene cloning. It
produces a very large number of DNA fragments (on the order of a million for a
mammalian genome), which will generate millions of different colonies of
transfected bacterial cells. Each of these colonies will be composed of a clone
derived from a single ancestor cell and therefore will harbor a recombinant
plasmid with the same inserted genomic DNA sequence. Such a plasmid is said to
contain a genomic DNA clone, and the entire collection of plasmids is said to
constitute



Figure 7-22 Purification and amplification of a specific DNA sequence by 
DNA cloning in a bacterium. Each bacterial cell carrying a recombinant 
plasmid develops into a colony of identical cells, visible as a spot on the
nutrient agar. By inoculating a single colony of interest into a liquid culture, 
one can obtain a large number of identical plasmid DNA molecules, each 
containing the same DNA insert. 



plasmid DNA molecule 
containing chromosomal 
DNA insert 



Figure 7-21 The formation of a
recombinant DNA molecule. The 

cohesive ends produced by many 
kinds of restriction nucleases allow 
two DNA fragments to join by 
complementary base-pairing (see 
Figure 7-2). DNA fragments joined in 
this way can be covalently linked in a 
highly efficient reaction catalyzed by 
the enzyme DNA ligase. In this 
example a recombinant plasmid DNA 
molecule containing a chromosomal 
DNA insert is formed. 



many plasmids each containing 
a different DNA insert, one of 
which is of interest 
Figure 7-23 The synthesis of cDNA.
A DNA copy (cDNA) of an mRNA
molecule is produced by the enzyme
reverse transcriptase (see p. 282),
thereby forming a DNA/RNA hybrid
helix. Treating the DNA/RNA hybrid
with alkali selectively degrades the
RNA strand into nucleotides. The
remaining single-stranded cDNA is
then copied into double-stranded
cDNA by the enzyme DNA
polymerase. As indicated, both
reverse transcriptase and DNA
polymerase require a primer to begin
their synthesis. For reverse
transcriptase a small oligonucleotide
is used; in this example oligo(dT) has
been annealed with the long poly-A
tract at the 3' end of most mRNAs.
Note that the double-stranded cDNA
molecule produced here lacks
cohesive ends; such blunt-ended DNA
molecules can be cloned by one of
several procedures that are analogous
to (but less efficient than) that shown
in Figure 7-21.



a genomic DNA library. But because the genomic DNA is cut into fragments at 
random, only some fragments will contain genes; many will contain only a por- 
tion of a gene, while most of the genomic DNA clones obtained from the DNA 
of a higher eucaryotic cell will contain only noncoding DNA, which, as we shall 
discuss in Chapter 8, makes up most of the DNA in such genomes. 

An alternative strategy is to begin the cloning process by selecting only those 
DNA sequences that are transcribed into RNA and thus are presumed to corre- 
spond to genes. This is done by extracting the mRNA (or a purified subfraction 
of the mRNA) from cells and then making a complementary DNA (cDNA) copy 
of each mRNA molecule present; this reaction is catalyzed by the reverse
transcriptase enzyme of retroviruses, which synthesizes a DNA chain on an RNA
template. The single-stranded DNA molecules synthesized by the reverse
transcriptase are converted into double-stranded DNA molecules by DNA
polymerase, and these molecules are inserted into a plasmid or virus vector and
cloned (Figure 7-23). Each clone obtained in this way is called a cDNA clone, and
the entire collection of clones derived from one mRNA preparation constitutes
a cDNA library.
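The two copying steps of cDNA synthesis reduce to base-pairing rules, which the following minimal sketch makes explicit. The mRNA sequence and the function names are invented for illustration, and enzymatic details such as priming efficiency are ignored.

```python
# Minimal sketch of cDNA synthesis as base-pairing; the mRNA below is made up.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand(mrna):
    """Reverse transcriptase step: DNA strand complementary to the mRNA (5' to 3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(cdna):
    """DNA polymerase step: copy the single-stranded cDNA into its complement."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


mrna = "AUGGCCUGAAAAAA"            # coding region followed by a short poly-A tail
cdna_1 = first_strand(mrna)
cdna_2 = second_strand(cdna_1)
print(cdna_1)
print(cdna_2)
```

The first strand begins with a run of T residues, which corresponds to the oligo(dT) primer annealed to the poly-A tract, and the second strand reproduces the mRNA sequence with T in place of U.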

There are important differences between genomic DNA clones and cDNA 
clones, as illustrated in Figure 7-24. Genomic clones represent a random sample 
of all of the DNA sequences in an organism and, with very rare exceptions, will 
be the same regardless of the cell type used to prepare them. By contrast, cDNA 
clones contain only those regions of the genome that have been transcribed into 
mRNA; as the cells of different tissues produce distinct sets of mRNA molecules,
a different cDNA library will be obtained for each type of cell used to prepare the 
library. 

cDNA Clones Contain Uninterrupted Coding Sequences 18 

The use of a cDNA library for gene cloning has several advantages. First, some 
proteins are produced in very large quantities by specialized cells. In this case, 
Figure 7-24 The differences between
cDNA clones and genomic DNA
clones. In this example gene A is
infrequently transcribed while gene B
is frequently transcribed, and both
genes contain introns (green). In the
genomic DNA clones both the introns 
and the nontranscribed DNA are 
included, and most clones will 
contain only part of the coding 
sequence of a gene. In the cDNA 
clones the intron sequences have 
been removed by RNA splicing during 
the formation of the mRNA, and a 
continuous coding sequence is 
therefore present. 



genomic DNA clones 



cDNA clones 



the mRNA encoding the protein is likely to be produced in such large quantities
that a cDNA library prepared from the cells will be highly enriched for the cDNA
molecules encoding the protein, greatly reducing the problem of identifying the
desired clone in the library (see Figure 7-24). Hemoglobin, for example, is made
in large amounts by developing erythrocytes (red blood cells); for this reason the
globin genes were among the first to be cloned.

By far the most important advantage of cDNA clones is that they contain the
uninterrupted coding sequence of a gene. Eucaryotic genes usually consist of
short coding sequences of DNA (exons) separated by longer noncoding sequences
(introns); the production of mRNA entails the removal of the noncoding
sequences from the initial RNA transcript and the splicing together of the coding
sequences. Neither bacterial nor yeast cells will make these modifications to the
RNA produced from a gene of a higher eucaryotic cell. Thus, if the aim of the
cloning is either to deduce the amino acid sequence of the protein from the DNA
or to produce the protein in bulk by expressing the cloned gene in a bacterial or
yeast cell, it is much preferable to start with cDNA.
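A small worked example may help show why the uninterrupted coding sequence matters. In the sketch below, an invented gene with one intron is translated twice with the standard genetic code: once after splicing (as in a cDNA clone) and once unspliced (as a bacterium would read a genomic clone). The gene, the exon coordinates, and the helper names are all hypothetical.

```python
# Sketch: translating a spliced versus an unspliced sequence (made-up gene).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering of codons.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def splice(gene, exons):
    """Join the exon intervals (start, end) of a gene, discarding the introns."""
    return "".join(gene[start:end] for start, end in exons)


def translate(dna):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


exon_1, intron, exon_2 = "ATGGCT", "GTAAGTTGATAG", "GAATGGTAA"
genomic = exon_1 + intron + exon_2
cdna = splice(genomic, [(0, 6), (18, 27)])

print(translate(cdna))      # protein encoded by the spliced message
print(translate(genomic))   # what results if the intron is read as coding sequence
```

In this toy case the intron introduces wrong residues and a premature stop codon, so only the cDNA gives the intended protein.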

Genomic and cDNA libraries are inexhaustible resources that are widely
shared among investigators. Today, many such libraries are also available from
commercial sources.

cDNA Libraries Can Be Prepared from Selected
Populations of mRNA Molecules 19

When cDNAs are prepared from cells that express the gene of interest at
extremely high levels, the majority of cDNA clones may contain the gene sequence,
which can therefore be selected with minimal effort. For less abundantly
transcribed genes, various methods can be used to enrich for particular mRNAs
before making the cDNA library. If an antibody against the protein is available,
for example, it can be used to precipitate selectively those polyribosomes that
have the appropriate growing polypeptide chains attached to them. Since these
polyribosomes will also have attached to them the mRNA coding for the protein,
the precipitate may be enriched in the desired mRNA by as much as 1000-fold.
Subtractive hybridization provides a powerful alternative way of enriching
for particular nucleotide sequences prior to cDNA cloning. This selection proce- 
dure can be used, for example, if two closely related cell types are available from 
the same organism, only one of which produces the protein or proteins of inter- 
est. It was first used to identify cell-surface receptor proteins present on T lym- 
phocytes but not on B lymphocytes. It can also be used wherever a cell that ex- 
presses the protein has a mutant counterpart that does not. The first step is to 
synthesize cDNA molecules using the mRNA from the cell type that makes the 
protein of interest. These cDNAs are then hybridized with a large excess of mRNA 
molecules from the second cell type. Those rare cDNA sequences that fail to find
a complementary mRNA partner are likely to represent mRNA sequences present 
only in the first cell type. Because these cDNAs remain unpaired after the hybrid- 
ization, they can be purified by a simple biochemical procedure (a hydroxyapatite 
column) that separates single-stranded from double-stranded nucleic acids (Fig- 
ure 7-25). Besides providing a powerful way to clone genes whose products are 
known to be restricted to a specific differentiated cell type, cDNA libraries pre- 
pared after subtractive hybridization are useful for defining the differences in 
gene expression between any two related types of cells. 
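In computational terms, subtractive hybridization behaves like a set difference. The sketch below is a deliberately simplified model: the mRNA sequences are invented, hybridization is reduced to exact complementarity, and the function names are hypothetical.

```python
# Sketch of subtractive hybridization as a set difference (invented sequences).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def make_cdna(mrna):
    """First-strand cDNA complementary to an mRNA (5' to 3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def rna_partner(cdna):
    """The mRNA sequence that would base-pair with this single-stranded cDNA."""
    return "".join(DNA_TO_RNA[base] for base in reversed(cdna))


def subtract(cdnas, driver_mrnas):
    """Keep the cDNAs that find no complementary partner in the excess mRNA."""
    driver = set(driver_mrnas)
    return [cdna for cdna in cdnas if rna_partner(cdna) not in driver]


t_cell_mrnas = ["AUGGCC", "AUGUUU", "AUGCAC"]   # first cell type
b_cell_mrnas = ["AUGGCC", "AUGUUU"]             # second cell type, used in excess

t_cell_cdnas = [make_cdna(m) for m in t_cell_mrnas]
unpaired = subtract(t_cell_cdnas, b_cell_mrnas)
print(unpaired)
```

The cDNAs that remain unpaired are the ones a hydroxyapatite column would recover as single-stranded molecules in the procedure described above.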

Either a DNA Probe or a Test for Expressed Protein Can Be
Used to Identify the Clones of Interest in a DNA Library

The most difficult part of gene cloning is often the identification of the rare colo- 
nies in the library that contain the DNA fragment of interest. This is especially 
true in the case of a genomic library, where one has to identify one bacterial cell 
in a million to select a specific mammalian gene. The technique most frequently 
used is a form of in situ hybridization that takes advantage of the exquisite speci- 
ficity of the base-pairing interactions between two complementary nucleic acid 
molecules. Culture dishes containing the growing bacterial colonies are blotted 
with a piece of filter paper, to which some members of each bacterial colony 
adhere. The adhering colonies, known as replicas, are treated with alkali to dis- 
rupt the cells and to separate the strands of their DNA molecules; the paper is 
then incubated with either a radioactive or a chemically labeled DNA probe con- 
taining part of the sequence of the gene being sought (Figure 7-26). If necessary, 
millions of bacterial clones can be screened in this way to find the one clone that 
hybridizes with the probe. 
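The screening step can be pictured as a search for the probe sequence in each colony's insert. The following sketch assumes exact matching and invented sequences; because alkali treatment leaves both strands of the plasmid DNA on the filter, the probe is checked against both orientations of each insert.

```python
# Sketch of a colony hybridization screen (made-up inserts and probe).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def screen(colonies, probe):
    """Return the names of colonies whose denatured insert can pair with the probe.

    Only the top strand of each insert is stored. The probe pairs with the
    bottom strand when the probe sequence appears in the top strand, and with
    the top strand when its reverse complement appears there.
    """
    target = reverse_complement(probe)
    return [
        name
        for name, insert in colonies.items()
        if probe in insert or target in insert
    ]


colonies = {
    "colony 1": "AAACCCGGGTTT",
    "colony 2": "TTGACCATGGCA",   # contains the probe sequence itself
    "colony 3": "GGGATGGTCAAA",   # contains the probe's reverse complement
}

print(screen(colonies, "GACCAT"))
```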

In order to find the clone of interest, a specific probe must be made. How this 
is done will depend on the information that is available about the gene to be 
cloned. In many cases the protein of interest has been identified by biochemi- 
cal studies and purified in small amounts. Only a few micrograms of pure pro- 
tein are often enough to determine the sequence of 30 or so amino acid residues. 
From this amino acid sequence the corresponding nucleotide sequence can be 
deduced using the genetic code (with some ambiguities corresponding to amino 
acids that can be represented by several alternative codons). Two sets of DNA oli- 
gonucleotides, chosen to match different parts of the predicted nucleotide se- 
quence of the gene, are then synthesized by chemical methods (Figure 7-27). 
Colonies of cells that hybridize with both sets of DNA probes are strong candi- 
dates for containing the desired gene and are saved for further characterization 
(see below). 
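The choice of probe regions can be made concrete by counting, for each stretch of amino acids, how many nucleotide sequences could encode it. The sketch below uses the standard genetic code and the 23-residue sequence decoded here from the codons listed in Figure 7-27; the window width of six residues and the function names are assumptions made for illustration.

```python
# Sketch: finding the least degenerate region of a peptide for probe design.
from collections import Counter
from math import prod

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering of codons.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


def best_window(protein, width=6):
    """Return (degeneracy, start index, peptide) for the least ambiguous window."""
    windows = [
        (degeneracy(protein[i:i + width]), i, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    return min(windows)


# One-letter version of the sequence decoded from the codons in Figure 7-27.
PEPTIDE = "GVRMDWNYEPLSTWEMNQWFVRA"

print(degeneracy("MDWNYE"), degeneracy("WEMNQW"))
print(best_window(PEPTIDE))
```

The two regions with the fewest possibilities correspond to the 16-fold and 8-fold degenerate probe mixtures in the figure. The figure's 20-nucleotide probes also extend two bases into the next codon; those positions are unambiguous, so the counts are unchanged.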

Probes can also be obtained in other ways. If an antibody is available that 
recognizes the protein produced by the gene, it can be labeled and used as a 
probe to find a clone that is producing the protein, which therefore contains the 
desired gene. Any other ligand that is known to bind to the protein encoded by 
the gene can also be used as a probe: if the gene encodes a receptor protein, for 
example, the ligand that normally binds to the receptor can, in principle, be used 
as a probe. 
Figure 7-25 Subtractive
hybridization. In this example the
technique is used to purify rare cDNA
clones corresponding to mRNA
molecules present in T lymphocytes
but not in B lymphocytes. Because
the two cell types are very closely
related, most of the mRNAs will be
common to both cell types;
subtractive hybridization is thus a
powerful way to enrich for those
specialized molecules that distinguish
the two cells.
Figure 7-26 An efficient technique commonly used to detect a bacterial
colony carrying a particular DNA clone. A replica of the culture is made by 
pressing a piece of absorbent paper against the surface. This replica is 
treated with alkali (to disrupt the cells and denature the plasmid DNA) and 
then hybridized to a highly radioactive DNA probe. Those bacterial colonies 
that have bound the probe are identified by autoradiography. (See also 
Figure 7-22.) 



Whenever the protein product of a gene is to be detected rather than the gene
itself, a special type of cDNA library is required. It is prepared in a special
plasmid or virus called an expression vector, which directs the transfected
bacterium to synthesize large amounts of the protein encoded by the foreign DNA
insert contained within the vector's DNA, as we shall discuss later.




In Vitro Translation Facilitates Identification
of the Correct DNA Clone 21

Any method that is used to find a specific clone from a cDNA or genomic DNA
library will usually pick out many false positive clones. Further ingenuity is
required to discriminate between these and the authentic clones desired. The task
is easiest when the desired clone encodes a protein that has already been
characterized by other means. In this case each candidate DNA can be tested by one
of several methods for its ability to encode the appropriate protein. The cloned
DNA can be inserted into an expression vector, for example, so that the protein
that it encodes is produced in large amounts in a bacterium. Alternatively, the
cloned DNA can be used to obtain a corresponding RNA molecule, either through
in vitro synthesis with a purified RNA polymerase (see Figure 7-36) or by a
technique called hybrid selection. In the latter method a mixture of cellular RNAs
is added to an excess of single strands of the candidate DNA, and DNA/RNA
hybridization is used to purify complementary mRNA molecules from the mixture. In
either case the mRNA obtained is allowed to direct protein synthesis in a cell-free
system using radioactive amino acids, and the radioactive protein produced is
then characterized and compared with the expected protein product of the
desired clone. A match in any of these tests allows one to conclude that a cloned
DNA fragment encodes the correct protein.
Figure 7-27 (diagram). Known portion of amino acid sequence, read from the
amino terminus (H2N), with the possible codons for each residue (5' to 3').
The amino acid names are decoded from the listed codons.

Gly  GGA GGC GGG GGU
Val  GUA GUC GUG GUU
Arg  AGA AGG CGA CGC CGG CGU
Met  AUG
Asp  GAC GAU
Trp  UGG
Asn  AAC AAU
Tyr  UAC UAU
Glu  GAA GAG
Pro  CCA CCC CCG CCU
Leu  UUA UUG CUA CUC CUG CUU
Ser  AGC AGU UCA UCC UCG UCU
Thr  ACA ACC ACG ACU
Trp  UGG
Glu  GAA GAG
Met  AUG
Asn  AAC AAU
Gln  CAA CAG
Trp  UGG
Phe  UUC UUU
Val  GUA GUC GUG GUU
Arg  AGA AGG CGA CGC CGG CGU
Ala  GCA GCC GCG GCU

Regions of coding sequence with least ambiguity, and the synthetic
oligonucleotides used as probes:

AUG GA(C/U) UGG AA(C/U) UA(C/U) GA(A/G) CC   (16 possibilities)
UGG GA(A/G) AUG AA(C/U) CA(A/G) UGG UU       (8 possibilities)



The Selection of Overlapping DNA Clones Allows One 
to "Walk" Along the Chromosome to a Nearby Gene 
of Interest 22 

Many of the most interesting genes— for example, those that control develop- 
ment—are known only from genetic analysis of mutants in such organisms as the 
fruit fly Drosophila and the nematode Caenorhabditis elegans. The protein prod- 
ucts of these genes are unknown and may be present in very small quantities in 
a few cells or produced only at one stage of development. A study of the genetic 
linkage between different mutations, however, can be used to generate chromo- 
some maps, which give the relative locations of the genes (see Figure 7-16). Once 
one mapped gene has been cloned, the clones in a genomic DNA library that 
correspond to neighboring genes can be identified using a technique called chro- 
mosome walking. The methods described in this chapter can then be used to 
deduce the exact structure and function of the gene of interest and the protein 
that it encodes. 

In chromosome walking one starts with a DNA clone corresponding to a gene 
or an RFLP marker that is known to be as close as possible to the gene of inter- 
est. One end of this clone is used to prepare a DNA probe, which is then used in 
DNA hybridization experiments to find an overlapping clone in a genomic DNA 
clone library. The DNA from this second DNA clone is purified, and its far end 
is used to prepare a second DNA probe, which is used to find a clone that is over- 
lapping, and so on. In this way one can walk along a chromosome one clone at 
a time, in steps of 30,000 base pairs or more in either direction (Figure 7-28). 
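The walking procedure is essentially an iterative search, which the following sketch illustrates. It is a simplified model under stated assumptions: clones are represented only by their coordinates on the chromosome, a clone "hybridizes" if it contains the whole probe, the walk goes in one direction, and the library, coordinates, and names are all invented.

```python
# Sketch of a rightward chromosome walk over an invented clone library.
from typing import NamedTuple


class Clone(NamedTuple):
    name: str
    start: int   # first base covered (inclusive)
    end: int     # last base covered (exclusive)


def walk(library, start_clone, target_pos, probe_len=1000):
    """Step from clone to overlapping clone until one covers target_pos."""
    path = [start_clone]
    current = start_clone
    while not (current.start <= target_pos < current.end):
        # The probe is the far (right-hand) end of the current clone.
        probe_start = current.end - probe_len
        hits = [
            clone for clone in library
            if clone.start <= probe_start and clone.end > current.end
        ]
        if not hits:
            raise LookupError("no overlapping clone: the walk is interrupted")
        # Prefer the hit that extends furthest, to take the largest step.
        current = max(hits, key=lambda clone: clone.end)
        path.append(current)
    return path


library = [
    Clone("A", 0, 30_000),
    Clone("B", 20_000, 50_000),
    Clone("C", 25_000, 60_000),
    Clone("D", 55_000, 90_000),
    Clone("E", 85_000, 120_000),
]

steps = walk(library, library[0], target_pos=100_000)
print([clone.name for clone in steps])
```

In a real walk the target position is of course not known in advance; it stands in here for whatever test (such as rescue of the mutant phenotype) tells the investigator that the gene has been reached.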

How does one know when the gene of interest (identified originally by a 
deleterious mutation) has been reached, given that the walk is generally too long 
for complete DNA sequencing to be practicable? For experimental organisms 
such as fruit flies, nematodes, Arabidopsis, yeast, and mice, the ultimate proof 
of the correct gene is to transfer the normal form of the gene (as a cloned DNA 
molecule) into a chromosome of the mutant organism, producing a transgenic 
organism (see Figures 7-45 and 7-49). If the original mutation was a recessive 
one, the correct DNA should reverse the original mutant phenotype. Other, 
less stringent criteria, however, are often used and are necessary in the case of 
human genes, as described later (see Figure 7-30). 

Ordered Genomic Clone Libraries Are Being Produced 
for Selected Organisms 23 

The whole task of identifying mutant genes should become vastly easier as knowl- 
edge of the sequence of the normal genome becomes more complete and sys- 
tematic. By using methods related to those described for chromosome walking, 
it has been possible to order (map) a complete, or nearly complete, set of large 



Figure 7-27 Selecting regions of a
known amino acid sequence to make
synthetic oligonucleotide probes.
Although only one nucleotide
sequence will actually code for the
protein, the degeneracy of the genetic
code means that several different
nucleotide sequences will give the
same amino acid sequence, and it is
impossible to tell in advance which is
the correct one. Because it is desirable
to have as large a fraction of the
correct nucleotide sequence as
possible in the mixture of
oligonucleotides to be used as a
probe, those regions with the fewest
possibilities are chosen, as illustrated.
In this example the mixture of 8
closely related oligonucleotides
shown might be synthesized and used
to probe a clone library, and the
indicated mixture of 16
oligonucleotides would be used to
reprobe all positive clones to find
those that actually code for the
desired protein. After the
oligonucleotide mixture is
synthesized by chemical means, the 5'
end of each oligonucleotide is
radioactively labeled (see Figure
7-6B); alternatively, the probe can be
marked with a chemical label by
incorporating a modified nucleotide
during its synthesis (see Figure 7-18).
genomic clones along the chromosomes of the E. coli bacterium, the yeast
Saccharomyces cerevisiae, the fruit fly Drosophila, the plant Arabidopsis, and
the nematode C. elegans. Such large clones, each about 30,000 base pairs in
length, are usually prepared in bacteriophage lambda vectors called cosmids,
which are specially designed to accept only large DNA inserts. It takes a few
thousand such clones to cover the entire genome of an organism such as C. elegans
or Drosophila. To map the entire human genome in this way would require ordering
more than 100,000 clones in cosmids, which is very time consuming but
technically feasible. DNA fragments that are more than 10 times larger than these
clones (300,000 to 1.5 million base pairs) can be cloned in yeast cells as YACs
(yeast artificial chromosomes) (Figure 7-29); in principle, the human genome
could be represented as about 10,000 clones of this type (see Figure 8-5).
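The clone counts quoted above follow from simple division, as the short calculation below shows. The human genome size of roughly 3 x 10^9 base pairs is an assumption supplied here (it is not stated in this passage), and the function name is illustrative.

```python
# Back-of-envelope check of the clone counts; genome size is an assumed round figure.
HUMAN_GENOME_BP = 3_000_000_000   # assumption: about 3 x 10^9 base pairs
COSMID_INSERT_BP = 30_000         # insert size quoted in the text
YAC_INSERT_BP = 300_000           # lower end of the YAC range quoted in the text


def min_clones(genome_bp, insert_bp):
    """Smallest number of non-overlapping clones that could tile the genome."""
    return -(-genome_bp // insert_bp)   # integer ceiling division


print(min_clones(HUMAN_GENOME_BP, COSMID_INSERT_BP))
print(min_clones(HUMAN_GENOME_BP, YAC_INSERT_BP))
```

These are lower bounds: an ordered library needs neighboring clones to overlap, which is why the text speaks of "more than" 100,000 cosmid clones.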

In the near future, ordered sets of genomic clones will no doubt be available 
from centralized DNA libraries for use by all research workers. Eventually, a com- 
plete library will be available for each commonly studied organism, with each 
DNA fragment catalogued according to its chromosome of origin and numbered 
sequentially with respect to the positions of all other DNA fragments derived from 
the same chromosome. One will then begin a "chromosome walk" simply by 
obtaining from the library all the clones covering the region of the genome that 
contains the mutant gene of interest. 



Figure 7-28 The use of overlapping 
DNA clones to find a new gene by 
"chromosome walking." To speed up 
the walk, genomic libraries containing 
very large cloned DNA molecules are 
optimal. To probe for the next clone 
in the walk by DNA hybridization, a 
short DNA fragment (labeled with a 
chemical or a radioisotope) from one 
end of the previously identified clone 
is purified: If a "right-handed" end is 
used, for example, the walk will go in 
the "rightward" direction, as shown in 
this example. Use of a small end 
fragment as a probe also reduces the 
probability that the probe will contain 
a repeated DNA sequence that would 
hybridize with many clones from 
different parts of the genome and 
thereby interrupt the walk. 



Positional DNA Cloning Reveals Human Genes 
with Unanticipated Functions 24 

Thousands of human diseases are caused by alterations in single genes. Our 
understanding of these genetic diseases is being revolutionized by recombinant
Figure 7-29 Overlapping genomic
DNA clones. The collection of clones
shown covers a small region of a
chromosome of the nematode worm
Caenorhabditis elegans and
represents 0.3% of the total genome.
(Adapted from J. Sulston et al., Nature
356:37-41, 1992. © 1992 Macmillan
Magazines Ltd.)



DNA methods, which allow the altered DNA to be cloned and sequenced,
revealing the precise defect in each patient. In this way, for example, Duchenne
muscular dystrophy was shown to be due to an abnormal cytoskeletal protein
in muscle cells, and cystic fibrosis to an abnormal chloride channel in
epithelial cells. This knowledge not only improves the accuracy of diagnosis but
also makes it possible, in principle at least, to design treatments.

Although the techniques used to find human disease genes have differed
depending on the disease, a standard approach has recently been developed that
makes it possible to isolate any human gene that is responsible by itself for a
specific trait or disease. It is called positional cloning because it starts with
genetic linkage mapping to locate the gene in the genome (Figure 7-30). While the
approach is straightforward, it presently requires 10 to 100 person-years to
isolate a gene in this way. It will become much easier once DNA sequencing is
highly automated and the full DNA sequence of the human genome is known: genetic
linkage mapping will reveal immediately which genes are prime suspects, and the
sequences of these genes can then be analyzed directly in individual patients. The
functions of thousands of human genes are likely to be identified in this way.

Selected DNA Segments Can Be Cloned in a Test Tube 
by a Polymerase Chain Reaction 25 

The availability of purified DNA polymerases and chemically synthesized DNA
oligonucleotides has made it possible to clone specific DNA sequences rapidly
without the need for a living cell. The technique, called the polymerase chain
reaction (PCR), allows the DNA from a selected region of a genome to be
amplified a billionfold, provided that at least part of its nucleotide sequence is
already known. First, the known part of the sequence is used to design two
synthetic DNA oligonucleotides, one complementary to each strand of the DNA
double helix and lying on opposite sides of the region to be amplified. These
oligonucleotides serve as primers for in vitro DNA synthesis, which is catalyzed



Figure 7-30 Positional cloning. The 

procedure requires a mutant human 
gene whose inheritance can be traced 
in many family groups by virtue of the 
phenotype that the mutation causes. 
Step 1, genetic mapping: RFLP
markers that are coinherited with the
phenotype are identified and used to
position the gene within about 10^6
base pairs (one megabase, or about 
1% the length of a typical human 
chromosome). Step 2, assembly of an 
ordered clone library: genomic DNA 
clones are obtained that cover the 
entire region between two RFLP 
markers that bracket the gene. Step 3, 
search for conserved DNA sequences: 
the portions of each DNA clone that 
hybridize with mouse DNA are 
identified; only those regions of the 
human chromosome whose 
nucleotide sequence is important will 
have been sufficiently conserved 
during evolution to form such a 
hybrid DNA helix (one strand human 
and the other mouse). Step 4, search 
for appropriate mRNAs: the subset of 
conserved DNA sequences that 
encode an mRNA in tissues where the 
mutant phenotype is expressed are 
the most likely to represent the 
mutant gene. Step 5, finding a 
difference in the DNA sequence of 
mutated genes. When the deleterious 
mutations in a typical human gene 
are analyzed, about 1 in 10 turns out 
to be a deletion that is easily detected 
as a change in the size of a restriction 
fragment detected by Southern 
blotting. For this reason one generally 
begins by screening the DNA of many 
human patients with the same disease 
using probes identified in step 4 
looking for such a change. If the 
mutation is not detectable as a 
deletion, other, more laborious 
methods that are capable of detecting 
single base changes must be used to 
identify the gene of interest.

by a DNA polymerase, and they determine the ends of the final DNA fragment
that is obtained (Figure 7-31).

The principle of the PCR technique is illustrated in Figure 7-32. Each cycle
of the reaction requires a brief heat treatment to separate the two strands of the
genomic DNA double helix (step 1). The success of the technique depends on the
use of a special DNA polymerase isolated from a thermophilic bacterium that is
stable at much higher temperatures than normal, so that it is not denatured by
the repeated heat treatments. A subsequent cooling of the DNA in the presence
of a large excess of the two primer DNA oligonucleotides allows these
oligonucleotides to hybridize to complementary sequences in the genomic DNA
(step 2). The annealed mixture is then incubated with DNA polymerase and the four
deoxyribonucleoside triphosphates so that the regions of DNA downstream from
each of the two primers are selectively synthesized (step 3). When the procedure
is repeated, the newly synthesized fragments serve as templates in their turn, and
within a few cycles the predominant product is a single species of DNA fragment
whose length corresponds to the distance between the two original primers. In
practice, 20 to 30 cycles of reaction are required for effective DNA amplification.
Each cycle doubles the amount of DNA synthesized in the previous cycle. A single
cycle requires only about 5 minutes, and an automated procedure permits
"cell-free molecular cloning" of a DNA fragment in a few hours, compared with the
several days required for standard cloning procedures.
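
The doubling described above can be checked with a small bookkeeping sketch. It assumes an idealized reaction that starts from a single double-stranded template and copies every strand exactly once per cycle (real reactions fall short of perfect doubling), and the name `pcr_strand_counts` is introduced here only for illustration. It tracks three classes of strand: the two original strands, "long" products copied from an original strand (only one end fixed by a primer), and strands whose two ends are both fixed by primers.

```python
def pcr_strand_counts(cycles):
    """Return (original, long_products, target) strand counts after `cycles`.

    Idealized assumption: one starting duplex, every strand copied each cycle.
    """
    original = 2        # the two strands of the starting duplex
    long_products = 0   # copies of an original strand: one primer-defined end
    target = 0          # both ends primer-defined: the final fragment length
    for _ in range(cycles):
        new_long = original                  # originals yield long products
        new_target = long_products + target  # every other template yields a target strand
        long_products += new_long
        target += new_target
    return original, long_products, target


if __name__ == "__main__":
    for n in (1, 3, 7, 20, 30):
        orig, long_p, target = pcr_strand_counts(n)
        total = orig + long_p + target
        print(f"cycle {n:2d}: {total} strands, {target} of target length "
              f"({target / total:.4f})")
```

Under these assumptions, three cycles give 8 target-length strands out of 16, and seven cycles give 240 out of 256. At about 5 minutes per cycle, 30 cycles take roughly two and a half hours.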

The PCR method is extremely sensitive; it can detect a single DNA molecule
in a sample. Trace amounts of RNA can be analyzed in the same way by first
transcribing them into DNA with reverse transcriptase. The PCR cloning technique
is rapidly replacing Southern blotting for the diagnosis of genetic diseases and
for the detection of low levels of viral infection. It also has great promise in
forensic medicine as a means of analyzing minute traces of blood or other tissues
(even as little as a single cell) and identifying the person from whom they came
by his or her genetic "fingerprint" (Figure 7-33).



Figure 7-32 PCR amplification. PCR produces an amount of DNA that
doubles in each cycle of DNA synthesis and includes a uniquely sized DNA
species. Three steps constitute each cycle, as described in the text. After
many cycles of reaction, the population of DNA molecules becomes
dominated by a single DNA fragment, X nucleotides long, provided that the
original DNA sample contains the DNA sequence that was anticipated
when the two oligonucleotides were designed. In the example illustrated,
three cycles of reaction produce 16 DNA chains, 8 of which have this
unique length (yellow); but after four more cycles, 240 of the 256 DNA
chains would be X nucleotides long.

Figure 7-31 The start of the
polymerase chain reaction (PCR) for 
amplifying specific nucleotide 
sequences in vitro. DNA isolated 
from cells is heated to separate its 
complementary strands. These 
strands are then annealed with an 
excess of two DNA oligonucleotides 
(each 15 to 20 nucleotides long) that 
have been chemically synthesized to 
match sequences separated by X 
nucleotides (where X is generally 
between 50 and 2000). The two 
oligonucleotides serve as specific 
primers for in vitro DNA synthesis 
catalyzed by DNA polymerase, which 
copies the DNA between the 
sequences corresponding to the two 
oligonucleotides.

Figure 7-33 The use of PCR in
forensic science. (A) A PCR reaction 
using two primers that bracket a 
particular microsatellite, or VNTR, 
sequence (see Figure 7-14C) pro- 
duces a different pair of DNA bands 
from each individual. One of these 
bands contains the repeated VNTR 
sequence that was inherited from 
the individual's mother and the 
other contains the repeated VNTR 
sequence that was inherited from the 
individual's father. (B) The large set of 
DNA bands obtained from a set of 
different PCR reactions, each of which 
amplifies the DNA from a different 
VNTR sequence, can serve as a 
"fingerprint" to identify each 
individual nearly uniquely. The 
starting material for the PCR reaction 
can be a single hair that was left at the 
scene of a crime. 
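
The logic of the caption can be illustrated with a minimal sketch. It assumes that the length of each PCR product equals a fixed flanking length (the primers plus any sequence between them and the repeat block) plus the repeat unit length times the number of repeats inherited on that chromosome. All locus names, lengths, and genotypes below are hypothetical values chosen for illustration, not data from the figure.

```python
def band_sizes(repeat_counts, repeat_unit_len, flank_len):
    """PCR product lengths for the maternal and paternal alleles at one locus.

    flank_len is the fixed part of the product (both primers plus any
    sequence between them and the repeat block); all values are hypothetical.
    """
    return tuple(sorted(flank_len + repeat_unit_len * n for n in repeat_counts))


def fingerprint(genotype, loci):
    """Map each locus name to its sorted pair of band sizes."""
    return {
        name: band_sizes(genotype[name], unit_len, flank_len)
        for name, (unit_len, flank_len) in loci.items()
    }


# Hypothetical loci: name -> (repeat unit length, fixed flank length), in nucleotides.
LOCI = {"locus_1": (4, 120), "locus_2": (5, 90), "locus_3": (3, 150)}

# Hypothetical genotypes: name -> (maternal repeat count, paternal repeat count).
crime_scene = {"locus_1": (7, 12), "locus_2": (9, 14), "locus_3": (15, 11)}
suspect_a = {"locus_1": (12, 7), "locus_2": (14, 9), "locus_3": (11, 15)}
suspect_b = {"locus_1": (7, 10), "locus_2": (9, 14), "locus_3": (15, 11)}

if __name__ == "__main__":
    reference = fingerprint(crime_scene, LOCI)
    for label, person in (("suspect_a", suspect_a), ("suspect_b", suspect_b)):
        match = fingerprint(person, LOCI) == reference
        print(label, "matches" if match else "does not match")
```

Because the bands are sorted by size, the comparison ignores which parent contributed which allele, just as a gel does. In this toy data set a single differing allele at one locus is enough to exclude `suspect_b`.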



Summary 

DNA cloning allows a copy of any specific part of a DNA or RNA sequence to be se- 
lected from the millions of other sequences in a cell and produced in unlimited 
amounts in pure form. DNA sequences are amplified after cutting chromosomal DNA 
with a restriction nuclease and inserting the resulting DNA fragments into the chro- 
mosome of a self-replicating genetic element (a plasmid or a virus). When a plasmid 
vector is used, the resulting "genomic DNA library" is housed in millions of bacterial 
cells, each carrying a different cloned DNA fragment. The bacterial colony contain- 
ing a DNA fragment of interest is identified by hybridization using a DNA probe or, 
following expression of a cloned gene or gene fragment in the bacterial host cell, by 
using a test that detects the desired protein product. The cells in the identified
bacterial colony are then allowed to proliferate, producing large amounts of the
desired DNA fragment.

The procedures used to obtain DNA clones that correspond in sequence to mRNA
molecules are the same except that the starting material is a DNA copy of the mRNA 
sequence, called cDNA, rather than fragments of chromosomal DNA. Unlike genomic 
DNA clones, cDNA clones lack intron sequences, making them the clones of choice for 
expressing and characterizing the protein product of a gene. 

PCR is a new form of DNA cloning that is carried out outside cells using a
purified, thermostable DNA polymerase enzyme. This type of DNA amplification
requires prior knowledge of the gene sequence, since two synthetic oligonucleotide
primers must be synthesized that bracket the DNA sequence to be amplified. PCR
cloning, however, has the advantage of being much faster and easier than standard
cloning methods.
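
The requirement that the two primers bracket the target can be made concrete with a short sketch. It is a simplification under stated assumptions: the template is given as one strand written 5' to 3', primers must match exactly, and only the first binding site of each primer is considered. The function names and the toy sequences are introduced here for illustration; the 6-nucleotide primers are far shorter than the 15 to 20 nucleotides used in practice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_amplicon(template, forward_primer, reverse_primer):
    """Return the segment of `template` bracketed by the two primers, or None.

    The forward primer matches the given strand; the reverse primer matches
    the opposite strand, so its site on the given strand is its reverse
    complement.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


if __name__ == "__main__":
    # Toy template and primers (hypothetical sequences).
    template = "GGCCATGCGTAAAATTTTCCGATTACTTAA"
    product = predict_amplicon(template, "ATGCGT", "GTAATC")
    print(product, len(product))
```

If either primer sequence is absent from the template the function returns `None`, which mirrors the point made above: without prior knowledge of the sequences flanking the target, no product of defined length can be amplified.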